Emotional Voice Conversion Using Neural Networks with Different Temporal Scales of F0 based on Wavelet Transform

نویسندگان

  • Zhaojie Luo
  • Tetsuya Takiguchi
  • Yasuo Ariki
چکیده

An artificial neural network is one of the most important models for training features of voice conversion (VC) tasks. Typically, neural networks (NNs) are very effective in processing nonlinear features, such as mel cepstral coefficients (MCC) which represent the spectrum features. However, a simple representation for fundamental frequency (F0) is not enough for neural networks to deal with an emotional voice, because the time sequence of F0 for an emotional voice changes drastically. Therefore, in this paper, we propose an effective method that uses the continuous wavelet transform (CWT) to decompose F0 into different temporal scales that can be well trained by NNs for prosody modeling in emotional voice conversion. Meanwhile, the proposed method uses deep belief networks (DBNs) to pretrain the NNs that convert spectral features. By utilizing these approaches, the proposed method can change the spectrum and the prosody for an emotional voice at the same time, and was able to outperform other state-of-the-art methods for emotional voice conversion.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Emotional voice conversion using neural networks with arbitrary scales F0 based on wavelet transform

An artificial neural network is an important model for training features of voice conversion (VC) tasks. Typically, neural networks (NNs) are very effective in processing nonlinear features, such as Mel Cepstral Coefficients (MCC), which represent the spectrum features. However, a simple representation of fundamental frequency (F0) is not enough for NNs to deal with emotional voice VC. This is ...

متن کامل

Emotional Voice Conversion with Adaptive Scales F0 Based on Wavelet Transform Using Limited Amount of Emotional Data

Deep learning techniques have been successfully applied to speech processing. Typically, neural networks (NNs) are very effective in processing nonlinear features, such as mel cepstral coefficients (MCC), which represent the spectrum features in voice conversion (VC) tasks. Despite these successes, the approach is restricted to problems with moderate dimension and sufficient data. Thus, in emot...

متن کامل

Deep Bidirectional LSTM Modeling of Timbre and Prosody for Emotional Voice Conversion

Emotional voice conversion aims at converting speech from one emotion state to another. This paper proposes to model timbre and prosody features using a deep bidirectional long shortterm memory (DBLSTM) for emotional voice conversion. A continuous wavelet transform (CWT) representation of fundamental frequency (F0) and energy contour are used for prosody modeling. Specifically, we use CWT to de...

متن کامل

Using PCA with LVQ, RBF, MLP, SOM and Continuous Wavelet Transform for Fault Diagnosis of Gearboxes

A new method based on principal component analysis (PCA) and artificial neural networks (ANN) is proposed for fault diagnosis of gearboxes. Firstly the six different base wavelets are considered, in which three are from real valued and other three from complex valued. Two wavelet selection criteria Maximum Energy to Shannon Entropy ratio and Maximum Relative Wavelet Energy are used and compared...

متن کامل

AN INTELLIGENT FAULT DIAGNOSIS APPROACH FOR GEARS AND BEARINGS BASED ON WAVELET TRANSFORM AS A PREPROCESSOR AND ARTIFICIAL NEURAL NETWORKS

In this paper, a fault diagnosis system based on discrete wavelet transform (DWT) and artificial neural networks (ANNs) is designed to diagnose different types of fault in gears and bearings. DWT is an advanced signal-processing technique for fault detection and identification. Five features of wavelet transform RMS, crest factor, kurtosis, standard deviation and skewness of discrete wavelet co...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016